RTS Model V2 Performance Analysis

TODO:

  • Possible to remove extra influence of multiple RTS within one tile?
  • Figure out why my IoU scores don’t quite match Yili’s - I’ve spent some time on this and am still uncertain.
  • Add statistical tests for boxplots
  • Shapley analysis

Set-Up

Load Libraries

Prep Google Drive Authentication

Define Functions

make_my_dir

pred_from_googledrive

rast_from_googledrive

json_from_googledrive

get_rast_id

bbox

pred_as_poly

val_as_poly

input_as_df

recall_precision

avg_precision

eval_expression

trim_outliers

assign_conf_stars

Prep Plot Variables

Load Data

Polygons

## Reading layer `rts_polygons_for_Yili_May_2022_v2' from data source 
##   `/home/hrodenhizer/Documents/permafrost_pathways/rts_mapping/rts_data_comparison/data/rts_polygons/rts_polygons_for_Yili_May_2022_v2.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 138 features and 11 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -139.218 ymin: 68.99102 xmax: 124.304 ymax: 72.99794
## Geodetic CRS:  WGS 84

Maxar GeoTiffs

20230601-161139-04062023

## []

Planet GeoTiffs

20230604-211027-p04062023

## []

Sentinel GeoTiffs

20220605-000543-p05062023

## []

Get Tile Bounding Boxes

Convert Predictions to Vector

Join Prediction Polygons into polys SF Dataframe

Convert Validation to Vector

Join Validation Polygons into polys SF Dataframe

Interactive Map of Features

IoU

Calculate Intersection and Union

Calculate IoU

## # A tibble: 3 × 2
##   imagery    mean_iou
##   <chr>         <dbl>
## 1 Maxar          0.66
## 2 Planet         0.64
## 3 Sentinel-2     0.64
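For reference, IoU is the area where the prediction and the validation feature overlap, divided by the area covered by either. The sketch below computes it on two small binary masks in base R. The matrices `pred` and `val` and the helper `iou_from_masks` are made up for illustration; the calculation in this notebook may work on vector polygons instead.

```r
# Hypothetical 4 x 4 binary masks (1 = RTS, 0 = background).
pred <- matrix(c(0, 1, 1, 0,
                 0, 1, 1, 0,
                 0, 0, 0, 0,
                 0, 0, 0, 0), nrow = 4, byrow = TRUE)
val  <- matrix(c(0, 0, 1, 1,
                 0, 0, 1, 1,
                 0, 0, 0, 0,
                 0, 0, 0, 0), nrow = 4, byrow = TRUE)

# IoU = |pred AND val| / |pred OR val|; NA when both masks are empty.
iou_from_masks <- function(pred, val) {
  intersection <- sum(pred == 1 & val == 1)
  union <- sum(pred == 1 | val == 1)
  if (union == 0) return(NA_real_)
  intersection / union
}

# 2 shared cells out of 6 covered cells
example_iou <- iou_from_masks(pred, val)
```

A tile-level IoU computed this way mixes all RTS features in the tile, which is one possible source of differences from per-feature scores.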

Mean Average Precision

Recall-Precision Curves

MAP

Plot

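Precision is the fraction of predicted RTS features that are correct, so it is lowered by false positives. Recall is the fraction of validation RTS features that were detected, so it is lowered by false negatives.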
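As a rough illustration of how precision, recall, and average precision relate, the base R sketch below ranks a handful of detections by confidence and accumulates true and false positives. The scores, match flags, and number of validation features are invented, and the non-interpolated average precision shown here is only one common definition; it is not necessarily what `recall_precision` and `avg_precision` do in this notebook.

```r
# Hypothetical detections: confidence score and whether each one matched a
# validation feature (TRUE = true positive, FALSE = false positive).
score <- c(0.9, 0.8, 0.7, 0.6, 0.5)
is_tp <- c(TRUE, TRUE, FALSE, TRUE, FALSE)
n_truth <- 4  # assumed number of validation features

# Rank detections from most to least confident.
ord <- order(score, decreasing = TRUE)
tp <- cumsum(is_tp[ord])
fp <- cumsum(!is_tp[ord])

precision <- tp / (tp + fp)  # share of predictions so far that are correct
recall <- tp / n_truth       # share of validation features found so far

# Non-interpolated average precision: precision at each rank, weighted by
# the recall gained at that rank (zero for false positives).
ap <- sum(precision * diff(c(0, recall)))
```

Averaging `ap` over classes or IoU thresholds would give a mean average precision; which of those is averaged over is a design choice.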

Performance by Feature Size

Calculate Area

Size Distribution by Region

## Warning: attribute variables are assumed to be spatially constant throughout
## all geometries

## # A tibble: 2 × 5
##   yg          mean_size min_size max_size median_size
##   <chr>           <dbl>    <dbl>    <dbl>       <dbl>
## 1 Other          20702.      484   107280       10484
## 2 Yamal/Gydan    11290.      512    47548        5732

Plot

Raw IoU Scores:

This is complicated by the fact that the area calculated from the raster validation layer may include several RTS features within one tile. Use rts_area (from the original RTS delineation) instead.

Run nls models and bootstrap parameters

Bootstrap predictions for plotting the nls models

Plot the Size/Performance plot

Active vs. General RTS Performance

## # A tibble: 3 × 7
##   imagery    p_val x_pos star_y_pos label_y_pos p_label         star_label
##   <fct>      <dbl> <dbl>      <dbl>       <dbl> <chr>           <chr>     
## 1 Maxar      0.962   1.5        0.9        0.95 p-value = 0.962 ""        
## 2 Planet     0.458   1.5        0.9        0.95 p-value = 0.458 ""        
## 3 Sentinel-2 0.528   1.5        0.9        0.95 p-value = 0.528 ""
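The empty `star_label` values above are what a conventional significance-star mapping would give for these p-values. A minimal sketch of such a mapping follows; `p_to_stars` and its cutoffs (0.05, 0.01, 0.001) are assumptions for illustration and may differ from `assign_conf_stars` in this notebook.

```r
# Hypothetical helper: map p-values to conventional significance stars.
# Cutoffs are the usual 0.05 / 0.01 / 0.001, assumed rather than taken
# from this analysis.
p_to_stars <- function(p) {
  ifelse(p < 0.001, "***",
         ifelse(p < 0.01, "**",
                ifelse(p < 0.05, "*", "")))
}

# p-values from the table above (Maxar, Planet, Sentinel-2)
p_val <- c(0.962, 0.458, 0.528)
star_label <- p_to_stars(p_val)
```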

Plot

It is possible to get rid of the inner panel borders, if I decide that looks better: https://stackoverflow.com/questions/46220242/ggplot2-outside-panel-border-when-using-facet

Drivers of Unexpected RTS Prediction Performance

Classify Features Using Confidence Interval Approach

This approach first uses the 95% CI of the model parameters to determine whether each RTS feature was predicted better or worse than expected based on the model. Next, for each imagery type, the size threshold above which RTS size no longer affects IoU is taken as the point where the slope of the model approaches 0 (currently slope < 1e-06). RTS features smaller than this threshold that were predicted better than expected are analyzed later to determine why some small RTS can be identified from the imagery.
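The two steps described above can be sketched in base R as follows. Everything specific here is an assumption made for illustration: the saturating curve `a * (1 - exp(-b * area))`, the parameter estimates and CI bounds, the area units, and the example features. The actual model form and fitted values in this analysis may differ.

```r
# Assumed saturating curve: IoU = a * (1 - exp(-b * area)).
iou_curve <- function(area, a, b) a * (1 - exp(-b * area))
iou_slope <- function(area, a, b) a * b * exp(-b * area)  # d(IoU)/d(area)

est   <- c(a = 0.70, b = 2.0e-4)  # hypothetical point estimates
lower <- c(a = 0.65, b = 1.5e-4)  # hypothetical lower 95% CI bounds
upper <- c(a = 0.75, b = 2.5e-4)  # hypothetical upper 95% CI bounds

# Step 1: classify features against the envelope spanned by the CI bounds.
# The curve increases with both a and b, so the bounds give the envelope.
classify_perf <- function(area, iou) {
  lo <- iou_curve(area, lower[["a"]], lower[["b"]])
  hi <- iou_curve(area, upper[["a"]], upper[["b"]])
  ifelse(iou > hi, "better", ifelse(iou < lo, "worse", "expected"))
}

# Step 2: size threshold = smallest area on a grid where slope < 1e-06.
area_grid <- seq(0, 100000, by = 10)
slopes <- iou_slope(area_grid, est[["a"]], est[["b"]])
size_threshold <- min(area_grid[slopes < 1e-06])

# Hypothetical features: flag the small ones predicted better than expected.
features <- data.frame(area = c(2000, 2000, 30000), iou = c(0.60, 0.10, 0.70))
features$perf <- classify_perf(features$area, features$iou)
features$small_and_better <- features$area < size_threshold &
  features$perf == "better"
```

A grid search is used for the threshold so the same code works for model forms without a closed-form derivative, if `iou_slope` is replaced by a numerical difference.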

Convert Input Data to DF

Zonal Statistics

Plot

These plots summarize the input data values (mean or standard deviation) in RTS cells, in background cells, and as the normalized difference between the two (Delta = (RTS - Background)/Background). Most of the input layer names should be self-explanatory, but for the others:

  • lum = luminance = 0.299*r + 0.587*g + 0.114*b
  • sr = shaded relief
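The luminance and Delta definitions above translate directly into base R. In the sketch below the function names `luminance` and `norm_delta` and the band values are invented for illustration; only the two formulas come from the text.

```r
# Luminance from red, green, and blue bands (weights as given above).
luminance <- function(r, g, b) 0.299 * r + 0.587 * g + 0.114 * b

# Normalized difference between RTS and background summaries.
# Undefined (Inf or NaN) when the background value is 0.
norm_delta <- function(rts, background) (rts - background) / background

# Hypothetical band values for three RTS cells and three background cells.
rts_cells <- data.frame(r = c(120, 130, 125), g = c(100, 110, 105),
                        b = c(80, 90, 85))
bg_cells  <- data.frame(r = c(90, 95, 100), g = c(110, 115, 120),
                        b = c(70, 75, 80))

# Zonal means of luminance, then the normalized difference between them.
rts_lum <- mean(luminance(rts_cells$r, rts_cells$g, rts_cells$b))
bg_lum  <- mean(luminance(bg_cells$r, bg_cells$g, bg_cells$b))
delta_lum <- norm_delta(rts_lum, bg_lum)
```

Because Delta divides by the background value, layers whose background mean is near zero will give very large or unstable values.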